-
Graph Neural Networks (GNNs) are a popular machine learning framework for graph processing applications. The framework exploits both the graph topology and the feature vectors of the nodes. One important application of GNNs is semi-supervised node classification. The accuracy of GNN-based node classification depends on (i) the number and (ii) the choice of the training nodes. In this article, we demonstrate that enlarging the training set with nodes from the same class that are spread across non-contiguous subgraphs can significantly improve accuracy. We accomplish this with a novel input intervention technique that can be used in conjunction with different GNN classification methods to increase the number of non-contiguous training nodes. We also present an output intervention technique to identify misclassified nodes and relabel them with their potentially correct labels. We demonstrate on real-world networks that our proposed methods, both individually and collectively, significantly improve accuracy over the baseline GNN algorithms. Both our methods are agnostic: apart from the initial set of training nodes generated by the baseline GNN methods, they need no extra knowledge about the classes of the nodes. Thus, our methods are modular and can be used as pre- and post-processing steps with many currently available GNN methods to improve their accuracy.
-
Coarse-grained (CG) molecular dynamics can be a powerful method for probing complex processes. However, most CG force fields use sets of pairwise nonbonded interaction potentials, which can limit their ability to capture complex multi-body phenomena such as the hydrophobic effect. Because the hydrophobic effect arises primarily from the nonpolar solute perturbing the nearby hydrogen bonding network in water, capturing it with a simple one-site (single “bead”) CG water model is a challenge. In this work, we systematically test the ability of one-site CG water models to capture critical features of the solvent environment around a hydrophobe as well as the potential of mean force (PMF) of neopentane association. We study two bottom-up models: a simple pairwise (SP) force-matched water model constructed using the multiscale coarse-graining method and the Bottom-Up Many-Body Projected Water (BUMPer) model, which has implicit three-body correlations. We also test the top-down monatomic (mW) and the Machine Learned mW (ML-mW) water models. The mW models perform well in capturing structural correlations but not the energetics of the PMF. BUMPer outperforms SP in capturing structural correlations and, in contrast to the two mW models, also gives an accurate PMF. Our study highlights the importance of including three-body interactions in CG water models, either explicitly or implicitly, and more generally the applicability of bottom-up CG water models for studying hydrophobic effects quantitatively. This assertion comes with a caveat, however, regarding the accuracy of the enthalpy–entropy decomposition of the PMF of hydrophobe association.
-
Constant communities, i.e., groups of vertices that are always clustered together, independent of the community detection algorithm used, are necessary for reducing the inherent stochasticity of community detection results. Current methods for identifying constant communities require multiple runs of one or more community detection algorithms. This process is extremely time-consuming and does not scale to large networks. We propose a novel approach for finding constant communities by transforming the problem into a binary classification of edges. We apply the Otsu method from image thresholding to classify edges based on whether they are always within a community or not. Our algorithm does not require any explicit detection of communities and can thus scale to very large networks on the order of millions of vertices. Our results on real-world graphs show that our method is significantly faster and that the constant communities produced have higher accuracy (as per F1 and NMI scores) than state-of-the-art baseline methods.
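To make the edge-thresholding idea concrete, here is a minimal sketch of how Otsu's criterion could split edges into two classes. It is not the authors' implementation. The abstract does not say which per-edge score is thresholded, so the `scores` dictionary, the edge names, and both function names are hypothetical. The sketch also applies Otsu's between-class variance criterion directly to the sorted scores rather than to a binned histogram, as image-processing libraries usually do.

```python
from typing import Dict, Hashable, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def otsu_threshold(values: Sequence[float]) -> float:
    """Return the cut that maximizes the between-class variance (Otsu's criterion)."""
    xs = sorted(values)
    n = len(xs)
    if n < 2 or xs[0] == xs[-1]:
        raise ValueError("need at least two distinct values")
    total = sum(xs)
    best_var, best_cut = -1.0, xs[0]
    left_sum = 0.0
    for i in range(1, n):
        left_sum += xs[i - 1]
        if xs[i] == xs[i - 1]:
            continue  # equal neighbours: no cut can separate them
        w0 = i / n                      # weight of the low class
        w1 = 1.0 - w0                   # weight of the high class
        mu0 = left_sum / i              # mean of the low class
        mu1 = (total - left_sum) / (n - i)  # mean of the high class
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_cut = (xs[i - 1] + xs[i]) / 2.0
    return best_cut


def classify_edges(scores: Dict[Edge, float]) -> Dict[Edge, bool]:
    """Label each edge True if its (hypothetical) score lies above the Otsu cut."""
    cut = otsu_threshold(list(scores.values()))
    return {edge: score > cut for edge, score in scores.items()}


# Hypothetical scores: higher means "more likely always inside one community".
scores = {
    ("a", "b"): 0.90,
    ("b", "c"): 0.85,
    ("a", "c"): 0.80,
    ("c", "d"): 0.10,
    ("d", "e"): 0.15,
}
labels = classify_edges(scores)
```

With these made-up scores, the three edges of the a-b-c triangle land in the high class and the two remaining edges in the low class. Because the cut is derived from the score distribution alone, no community detection run is needed at this step, which is the property the abstract relies on for scalability.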
